ABSTRACT

Loglinear  models  are  adapted  for  the  analysis  of  multivariate  social  networks,  a  set  of 
sociometric  relations  among  a  group  of  actors.  Models  that  focus  on  the  similarities  and 
differences  between  the  relations  and  models  that  concentrate  on  individual  actors  are  discussed. 
This  approach  allows  for  the  partitioning  of  the  actors  into  blocks  or  subgroups.  Some  ideas 
for  combining  these  models  are  described,  and  the  various  models  and  computational  methods 
are  applied  to  the  analysis  of  data  for  a  corporate  interlock  network  of  the  25  largest 
organizations  in  Minneapolis/St.  Paul  and  for  a  classic  network  of  eighteen  monks  in  a 
cloister. 


Key  Words:  Loglinear  model;  Directed  graph;  Social  network;  Sociometric  data;  Iterative 
proportional  fitting;  GLIM  model 
1.  INTRODUCTION

Sociometric  relations  are  typically  defined  for  a  set  of  social  actors.  A  social  network  is  a 
construct  describing  these  actors  and  the  various  relations  that  exist  among  them.  As  used  in 
the  social  sciences,  actors  have  been  individuals  in  groups,  organizations,  cities,  or  even  nation 
states;  relations  have  ranged  from  kinship  to  friendship  to  transfers  of  scarce  resources  to 
corporate  board  of  directors  interlocks. 

Moreno  (1934)  was  the  first  social  scientist  to  study  individual  networks  in  a  systematic 
manner,  and  was  apparently  the  first  network  researcher  to  use  mathematics.  Much  of  his 
terminology, including such terms as "sociogram," "sociomatrix," and "sociometric test," is still
in use today. Festinger (1949) and Katz (1947, 1953, 1955) developed Moreno's ideas, focussing
on matrix representations of sociometric data, the popularity of actors, mutuality of
relationships in social groups, and even the representation of interpersonal relations as a
stochastic process (see Katz and Proctor, 1959). Formal graph theory, as reviewed in
Section 2 of this paper, was introduced to social network research by Cartwright and Harary (1956),
in an attempt to quantify the social psychological theories of Heider (1958).

Since  these  pioneering  efforts,  sociologists,  social  psychologists,  and  social  anthropologists  have 
repeatedly  used  the  social  network  paradigm.  Davis  and  Leinhardt  (see  Davis,  1970)  scanned 
the  "sociometry"  literature  and  found  nearly  900  examples  of  social  networks  from  diverse 
small  groups.  Since  1970,  social  network  analysis  has  grown  rapidly  in  popularity.  Leinhardt 
(1977) presents a collection of twenty-four previously published papers which provide an
historical perspective on social network analysis, and a collection of papers in a volume edited
by Holland and Leinhardt (1979) summarizes the state-of-the-art as of about 1973. Burt
(1980)  discusses  more  recent  sociological  developments,  Wasserman  (1978)  reviews  alternative 
mathematical  models  for  small  group  behavior,  and  Frank  (1981)  summarizes  some  of  the 
statistical  theory  on  random  graphs.  Almost  none  of  this  research  on  the  analysis  of  social 
networks has appeared in statistical journals, with the exception of some of the work by Katz
and Wasserman (1980). There are just a few papers with substantial statistical content.

In  a  landmark  statistical  paper  for  network  analysis,  Holland  and  Leinhardt  (1981)  proposed 
an  exponential  family  of  probability  distributions  for  the  analysis  of  a  single  sociometric 
relation.  Fienberg  and  Wasserman  (1981a)  discussed  simple  computational  procedures  for  fitting 
these  models,  and  proposed  some  extensions  to  model  networks  in  which  the  actors  fall  into 
natural  subgroups.  These  distributions  include  parameters  that  relate  characteristics  of 
individual  actors  (e.g.,  popularity)  to  differential  rates  for  entering  into  or  severing  sociometric 
relations.  In  Fienberg,  Meyer,  and  Wasserman  (1981),  we  described  a  related  class  of  models 
for  multiple  relations,  extending  Holland  and  Leinhardt’s  family  to  more  than  one  relation  by 
focussing  on  the  associations  among  the  relations  rather  than  on  influences  of  individual  actors. 
Here we bring these two types of analyses together, and present some "combined models" for
the  analysis  of  multivariate  directed  graphs.  These  models  incorporate  actor  and  subgroup 
parameters,  and  quantities  to  measure  the  degree  of  interrelatedness  of  the  different  relations. 

Methods  to  study  a  multivariate  directed  graph  which  focus  solely  on  the  relations  and 
ignore  individual  social  actors  are  forms  of  macroanalysis.  Data  for  such  analyses  consist  of 
aggregate  counts  of  the  different  structural  patterns  which  occur  within  the  network.  The 
methods  for  studying  local  structure  in  a  network  by  using  the  triad  census  (Holland  and 
Leinhardt,  1975;  Wasserman,  1977)  can  be  labelled  macroanalytic.  Alternatively,  we  could  study 
the  attributes  of  the  actors,  and  how  these  attributes  affect  the  existing  ties  between  them. 
Such a study is a microanalysis, and promises a more fine-grained investigation.

Both  the  macroanalysis  and  microanalysis  approaches  have  substantive  value.  A  macroanalysis 
of  a  group  centers  on  the  global  structure  of  its  relations,  asking  questions  such  as:  Which 
relation exhibits the strongest "reciprocity," or is most likely to have symmetric flows? Are
there any "multiplex" patterns, flows of different relations in the same direction? Are there
any patterns of "exchange," in which a flow in one direction for one relation is reciprocated
by a flow in the opposite direction for a different relation? Are there any higher order
interactions, involving three or more flows for two or more relations?

A  microanalysis  of  a  group  is  a  local  study,  turning  attention  to  the  level  at  which  data  are 
actually  gathered.  Most  microanalyses  have  been  limited  to  groups  with  data  on  just  a  single 
relation. The primary concern of such studies is the individual group member: Which actors
have  the  most  prestige  or  popularity?  Which  actors  are  involved  in  many  relations,  which  in 
few?  Do  actors  enter  into  mutual,  symmetric  relationships  at  different  rates?  Such  questions, 
while  concerned  with  individual  actor  effects,  are  often  answered  by  examining  dyadic  or 
triadic  relationships. 

As  an  example,  we  consider  the  now  classic  study  of  18  monks  in  an  isolated  American 
monastery,  conducted  by  Sampson  (1969)  and  partially  analyzed  by  Holland  and  Leinhardt 
(1981), Breiger (1981), and many others. Sampson studied four types of relations: Affect,
Esteem, Influence, and Sanction. Actors were asked to give three positive choices (e.g.,
which three brothers do you like best, for positive affect) and three negative choices (e.g.,
which three brothers are you most antagonistic towards, for negative affect) for each of the
four  types.  In  this  way,  data  were  gathered  on  eight  relations:  (1)  Like  and  (2)  Antagonism 
(Affect),  (3)  Esteem  and  (4)  Disesteem,  (5)  Influence  and  (6)  Negative  Influence,  and  (7)  Praise 
and  (8)  Blame  (Sanction). 


We arrange these data into 8 binary sociomatrices, $X = (X_1, X_2, \ldots, X_8)$, each of
dimension 18 × 18. Versions of these arrays are given in Table 1, where the rows and columns
have been permuted to reflect constructed subgroupings of the actors. Since the 1's, 2's and 3's
in Table 1 refer to order of choices, we set all non-zero entries equal to 1 to obtain binary
sociomatrices.


TABLE  1 

Sampson's  (1969)  Data 
Holland and Leinhardt (1981) used the "Like" relation to illustrate their new methods. Other
researchers (White, Boorman, and Breiger, 1976; Breiger, Boorman, and Arabie, 1975) have
studied  all  the  relations,  but  in  a  non-statistical  attempt  to  aggregate  the  18  monks  in  a 
substantively  meaningful  manner.  In  later  sections  of  this  paper,  we  analyze  a  version  of  the 
network  which  aggregates  over  positive  and  negative  affects,  searching  for  both  macro-  and 
micro-models  that  provide  good  statistical  descriptions  of  the  relationships  among  the  actors. 


Most  sociometric  research,  both  empirical  and  mathematical,  is  preoccupied  with  overly 
simplistic  descriptions  of  group  structure.  This  is  very  apparent  in  Burt’s  (1980)  review.  The 
goal  of  this  paper  is  to  build  upon  the  ideas  of  Holland  and  Leinhardt  to  develop  models  for 
the  simultaneous  macro-  and  micro-analysis  of  multiple  relational  networks.  These  models  aid 
in  the  formulation  and  testing  of  theories  concerning  group  dynamics.  In  the  next  section  we 
review  Holland  and  Leinhardt’s  model  and  our  extensions  of  it,  and  then  in  Section  3  we 
illustrate  these  ideas  in  an  analysis  of  a  1976  corporate  interlock  network  from  the  Twin  Cities 
(Minneapolis/St.  Paul).  We  emphasize  the  many  substantive  findings  that  can  be  obtained 
from  this  form  of  statistical  modelling.  In  Section  4,  we  present  several  models  for  the 
analysis  of  data  from  multivariate  directed  graphs,  and  conclude  by  demonstrating  these  ideas 
on  Sampson’s  network. 
2.  BACKGROUND:  MODELS  FOR  SINGLE  RELATIONAL  DATA


A  directed  graph,  or  digraph,  consists  of  a  set  of  g  nodes,  and  sets  of  directed  arcs  or 
"choices"  connecting  pairs  of  nodes.  Digraphs  are  natural  mathematical  representations  of 
social  networks,  where  the  nodes  represent  individuals,  organizations  or  other  social  actors,  and 
the  arcs  represent  relations:  directed  attitudes,  feelings,  or  transfers,  such  as  friendship.  A 
digraph is frequently summarized by $g \times g$ sociomatrices, $X_r$, one for each of the $R$
defined relations. The $g$ diagonal terms of each sociomatrix, $X_{iir}$, are defined to be zero.


First consider a digraph with a single relation, $R = 1$. The row total, $X_{i+}$, is referred to as
the out-degree of node $i$, and the corresponding column total, $X_{+i}$, as the in-degree of node
$i$. A matrix $x$ can be thought of as the realization of a matrix of random variables, $X$, where
we assume that the $\binom{g}{2}$ pairs or dyads,

$$D_{ij} = (X_{ij}, X_{ji}), \qquad i < j,$$

are independent bivariate random variables, with $2^2 = 4$ possible realizations, only 3 of which
are distinguishable:

(1,1)  :  mutual

(1,0)  or  (0,1)  :  asymmetric

(0,0)  :  null.
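The three distinguishable dyad states can be tallied directly from a binary sociomatrix. The following is a minimal sketch, assuming a NumPy array with a zero diagonal; the function name `dyad_census` and the 4-actor matrix are illustrative and do not come from the data analyzed in this paper.

```python
import numpy as np

def dyad_census(x):
    """Count mutual, asymmetric and null dyads in a binary sociomatrix x."""
    x = np.asarray(x)
    g = x.shape[0]
    upper = np.triu_indices(g, k=1)      # each unordered pair i < j exactly once
    a, b = x[upper], x.T[upper]          # x_ij and x_ji for every dyad
    mutual = int(np.sum((a == 1) & (b == 1)))
    null = int(np.sum((a == 0) & (b == 0)))
    asymmetric = len(a) - mutual - null
    return mutual, asymmetric, null

# Hypothetical 4-actor digraph used only for illustration.
x = np.array([[0, 1, 1, 0],
              [1, 0, 0, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
census = dyad_census(x)   # (mutual, asymmetric, null)
```

The three counts always sum to the number of dyads, $\binom{g}{2}$, which is 6 for this small example.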


A multivariate directed graph, or multigraph, is described by a collection of random
sociomatrices $X = (X_1, X_2, \ldots, X_R)$, and we assume that the $\binom{g}{2}$ dyads,

$$D_{ij} = (X_{ij1}, X_{ji1}, \ldots, X_{ijR}, X_{jiR}), \qquad i < j,$$

are independent $2R$-variate random variables with $2^{2R}$ possible realizations. For both digraphs
and multigraphs, the assumption that the dyads are independent random variables is a crucial
one, and is not subject to examination by the framework developed in this paper.

Holland and Leinhardt (1981) introduced a class of models, labelled $p_1$, to model
micro-behavior in a social group on which only one relation has been defined. We now describe
these  models,  and  explain  how  their  analysis  can  be  accomplished  by  using  standard 
computational  approaches  to  the  analysis  of  loglinear  models  for  categorical  data.  We  then 
outline  some  extensions  of  these  models  that  allow  for  grouping  of  individual  actors.  Further 
details  can  be  found  in  Fienberg  and  Wasserman  (1981a).  In  Section  4  we  extend  this 
approach  to  the  analysis  of  multigraph  data. 


Consider a network of $g$ nodes and a single relation, and represent the sociomatrix $X$ as a
four-dimensional $g \times g \times 2 \times 2$ cross-classification $Y = (Y_{ijkl})$, where the
subscripts $i$ and $j$ refer to the two actors in a dyad, and $k$ and $l$ refer to the dyad state:

$$Y_{ijkl} = \begin{cases} 1, & \text{if } D_{ij} = (X_{ij}, X_{ji}) = (k,l) \\ 0, & \text{otherwise.} \end{cases} \tag{2.1}$$

For example, $Y_{ij11} = 1$ if $D_{ij}$ is a mutual dyad. Note that the 2 × 2 tables $Y_{ij}$
($i \neq j$) contain one 1 and three 0's. Furthermore, $Y_{ijkl} = Y_{jilk}$, and the marginal
totals of these 2 × 2 tables correspond to indicator variables for $X_{ij}$ and $X_{ji}$. Because
each margin is either (0,1) or (1,0), the interior of the table is completely determined by its
marginal totals.
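To make the construction in (2.1) concrete, the sketch below builds the $g \times g \times 2 \times 2$ array from a sociomatrix and checks the redundancy $Y_{ijkl} = Y_{jilk}$ and the recovery of $X$ from the margins. The function name `build_y` and the example matrix are hypothetical, chosen only for illustration.

```python
import numpy as np

def build_y(x):
    """Return y with y[i, j, k, l] = 1 iff (x[i, j], x[j, i]) == (k, l), i != j."""
    x = np.asarray(x, dtype=int)
    g = x.shape[0]
    y = np.zeros((g, g, 2, 2), dtype=int)
    for i in range(g):
        for j in range(g):
            if i != j:                      # diagonal 2 x 2 tables stay empty
                y[i, j, x[i, j], x[j, i]] = 1
    return y

# Hypothetical 4-actor digraph used only for illustration.
x = np.array([[0, 1, 1, 0],
              [1, 0, 0, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
y = build_y(x)

# Redundancy: swapping the actors also swaps the two state indices.
is_redundant = np.array_equal(y, y.transpose(1, 0, 3, 2))

# Summing over l for k = 1 gives the indicator of x_ij, i.e. x itself.
recovered_x = y[:, :, 1, :].sum(axis=2)
```

Each off-diagonal 2 × 2 table holds exactly one 1, so the array total is $g(g-1)$.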


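We denote a realization of $Y$ by $y = (y_{ijkl})$, and let $\pi_{ijkl}$ be the probability of the
observation $(k,l)$ for the dyad $(i,j)$, where

$$\sum_{k,l} \pi_{ijkl} = 1, \tag{2.2}$$

and we define $\mu_{ijkl} = \log \pi_{ijkl}$. The Holland-Leinhardt $p_1$ class of models is as follows:

$$\begin{aligned}
\mu_{ij00} &= \lambda_{ij} \\
\mu_{ij10} &= \lambda_{ij} + \alpha_i + \beta_j + \theta \\
\mu_{ij01} &= \lambda_{ij} + \alpha_j + \beta_i + \theta \\
\mu_{ij11} &= \lambda_{ij} + \alpha_i + \alpha_j + \beta_i + \beta_j + 2\theta + \rho_{ij},
\end{aligned} \tag{2.3}$$

where

$$\sum_{i=1}^{g} \alpha_i = \sum_{j=1}^{g} \beta_j = 0 \tag{2.4}$$

and $\rho_{ij} = \rho$.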

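The sufficient statistics for the parameters of $p_1$ are easily expressed as margins of $y$:

$$\begin{aligned}
\tfrac{1}{2}\, y_{++11} &= M, && \text{Number of mutuals,} \\
y_{i+1+} &= x_{i+}, && \text{Out-degree of node } i, \\
y_{+j1+} &= x_{+j}, && \text{In-degree of node } j, \\
y_{++1+} &= x_{++}, && \text{Total number of choices.}
\end{aligned} \tag{2.5}$$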
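The margins in (2.5) can be checked numerically. The sketch below is an illustration under stated assumptions: the helper `y_from_x` is a hypothetical vectorized construction of the $y$ array, the 4-actor sociomatrix is invented, and the number of mutuals is taken as half of $y_{++11}$ because every mutual dyad appears once as $(i,j)$ and once as $(j,i)$ in the full array.

```python
import numpy as np

def y_from_x(x):
    """Vectorized g x g x 2 x 2 array: y[i, j, k, l] = 1 iff (x_ij, x_ji) == (k, l)."""
    x = np.asarray(x, dtype=int)
    g = x.shape[0]
    off_diag = ~np.eye(g, dtype=bool)
    y = np.zeros((g, g, 2, 2), dtype=int)
    for k in (0, 1):
        for l in (0, 1):
            y[:, :, k, l] = off_diag & (x == k) & (x.T == l)
    return y

# Hypothetical 4-actor digraph used only for illustration.
x = np.array([[0, 1, 1, 0],
              [1, 0, 0, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
y = y_from_x(x)

mutuals = int(y[:, :, 1, 1].sum()) // 2          # y_{++11} / 2
out_degree = y[:, :, 1, :].sum(axis=(1, 2))      # y_{i+1+}
in_degree = y[:, :, 1, :].sum(axis=(0, 2))       # y_{+j1+}
total_choices = int(y[:, :, 1, :].sum())         # y_{++1+}
```

The degree vectors agree with the row and column sums of the sociomatrix itself.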

Through the use of the full $y$ array, and its redundancies, one can show that fitting $p_1$ to
the $x$ array is equivalent to fitting the "no three-factor interaction" loglinear model to $y$. A
proof of this equivalence is given in Meyer (1981). Thus we can fit $p_1$ to data by using the
standard iterative proportional fitting procedure (IPFP) applied to $y$. Furthermore, the special
cases of $p_1$, listed in Table 1 of Holland and Leinhardt (1981), all have equivalent loglinear
models for $y$ and thus can also be fit using the standard IPFP. The equivalent models are
given in Table 2 of Fienberg and Wasserman (1981a).
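A rough sketch of the IPFP step follows. It assumes that "no three-factor interaction" means the loglinear model defined by all six two-way margins of the four-way $y$ array, and that the diagonal cells $(i,i)$ are treated as structural zeros. The function names, the fixed number of sweeps, and the small example are illustrative choices, not the authors' implementation.

```python
from itertools import combinations
import numpy as np

def build_y(x):
    """y[i, j, k, l] = 1.0 iff (x[i, j], x[j, i]) == (k, l), for i != j."""
    x = np.asarray(x, dtype=int)
    g = x.shape[0]
    y = np.zeros((g, g, 2, 2))
    for i in range(g):
        for j in range(g):
            if i != j:
                y[i, j, x[i, j], x[j, i]] = 1.0
    return y

def fit_all_two_way_margins(y, sweeps=50):
    """Iterative proportional fitting to every two-way margin of a 4-way array."""
    g = y.shape[0]
    fit = np.ones_like(y)
    idx = np.arange(g)
    fit[idx, idx] = 0.0                       # structural zeros on the diagonal
    for _ in range(sweeps):
        for keep in combinations(range(4), 2):
            drop = tuple(a for a in range(4) if a not in keep)
            target = y.sum(axis=drop, keepdims=True)
            current = fit.sum(axis=drop, keepdims=True)
            # Rescale so the fitted margin matches the observed one;
            # cells with an empty fitted margin are left at zero.
            ratio = np.divide(target, current,
                              out=np.zeros_like(target), where=current > 0)
            fit = fit * ratio
    return fit

# Hypothetical 4-actor digraph used only for illustration.
x = np.array([[0, 1, 1, 0],
              [1, 0, 0, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
y = build_y(x)
fit = fit_all_two_way_margins(y)
```

In practice one would iterate until the margins stop changing rather than for a fixed number of sweeps, and would compare `fit` with `y` through the likelihood ratio statistic.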

An important generalization of $p_1$ starts with the equations (2.3) with constraints (2.4) and
further postulates that

$$\rho_{ij} = \rho + \rho_i + \rho_j, \qquad i < j, \tag{2.6}$$

where the $\{\rho_i\}$ are normalized to sum to zero. The effect of reciprocity now depends
additively on the individual actors in a dyad, and the $\{\rho_i\}$ measure the rates at which actors
are likely to enter into mutual, symmetric relationships. This model provides an important
goodness-of-fit test for $p_1$ (see Fienberg and Wasserman, 1981b) since it contains $p_1$ as a
special case, when $\rho_1 = \rho_2 = \cdots = \rho_g = 0$.

We now describe a variant on $p_1$ for single relational sociometric data that assumes that the
$g$ actors have been partitioned into $K$ subgroups. Of substantive interest is how likely it is
that actors in one subgroup have relations with actors in other subgroups, and how structurally
similar are actors in a given subgroup. We label the subgroups $G_1, G_2, \ldots, G_K$, where the
partition of actors is mutually exclusive and exhaustive, and assume that subgroup $G_k$ contains
$g_k$ actors, such that $g_1 + g_2 + \cdots + g_K = g$. For example, White, Boorman, and Breiger (1976)
(see also Breiger, 1981) aggregate the 18 monks from Sampson's cloister into 3 "blocks" or
subgroups, containing $g_1 = 7$, $g_2 = 7$, $g_3 = 4$ actors. This aggregation is reflected in Table 1,
where the rows and columns of the $X$ matrices have been rearranged so that the first 7 rows
and columns refer to actors in $G_1$, and so forth. Breiger, Boorman, and Arabie (1975)
construct a slightly different partition. We note that these partitions, called "blockmodels,"
were accomplished by grouping together all actors that are "structurally equivalent," relating to
the other actors in the group in identical fashion (see Lorrain and White, 1971).

We modify equations (2.3) by introducing inter- and intra-subgroup choice and reciprocity
parameters:

$$\begin{aligned}
\mu_{ij00} &= \lambda^{(rs)} \\
\mu_{ij10} &= \lambda^{(rs)} + \theta^{(rs)} \\
\mu_{ij01} &= \lambda^{(rs)} + \theta^{(sr)} \\
\mu_{ij11} &= \lambda^{(rs)} + \theta^{(rs)} + \theta^{(sr)} + \rho^{(rs)},
\end{aligned} \qquad i \in G_r \text{ and } j \in G_s. \tag{2.7}$$

The parameters $\{\theta^{(rs)}\}$ are choice effects, and the $\{\rho^{(rs)}\}$, reciprocity effects. The
parameters $\{\lambda^{(rs)}\}$ are included to ensure that the $\pi_{ijkl}$ sum to 1 for each dyad. One
special case of the subgroup model (2.7) sets $\rho^{(rs)} = 0$ for all $r$ and $s$. Holland and
Leinhardt (1981) note that, if we further define

$$\pi^{(rs)} = P(X_{ij} = 1 \mid i \in G_r \text{ and } j \in G_s), \tag{2.8}$$

then, in this special case,

$$\theta^{(rs)} = \log \frac{\pi^{(rs)}}{1 - \pi^{(rs)}} = \operatorname{logit}(\pi^{(rs)}). \tag{2.9}$$
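Relation (2.9) suggests a simple descriptive calculation: estimate each $\pi^{(rs)}$ by the observed density of choices from block $r$ to block $s$ and take its logit. The sketch below assumes that every block has at least two actors and uses an invented 6-actor network with two blocks; the function name `block_choice_logits` is hypothetical.

```python
import math
import numpy as np

def block_choice_logits(x, labels):
    """Return (pi, theta): block densities pi[r, s] and theta[r, s] = logit(pi[r, s])."""
    x = np.asarray(x)
    labels = np.asarray(labels)
    K = int(labels.max()) + 1
    pi = np.zeros((K, K))
    for r in range(K):
        rows = np.where(labels == r)[0]
        for s in range(K):
            cols = np.where(labels == s)[0]
            block = x[np.ix_(rows, cols)]
            # Ordered pairs available in the block; no self-choices within a block.
            pairs = len(rows) * len(cols) - (len(rows) if r == s else 0)
            pi[r, s] = block.sum() / pairs
    with np.errstate(divide="ignore"):
        theta = np.log(pi) - np.log1p(-pi)     # logit; -inf when a block is empty
    return pi, theta

# Hypothetical network: actors 0-2 form block 0, actors 3-5 form block 1.
x = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (1, 0), (1, 2), (0, 3), (3, 0), (4, 1),
             (5, 2), (3, 4), (4, 3), (4, 5), (5, 4), (3, 5)]:
    x[i, j] = 1
pi, theta = block_choice_logits(x, [0, 0, 0, 1, 1, 1])
```

A block density of one half gives a logit of zero, and densities below one half give negative choice effects.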


A second special case of (2.7) is also a special case of $p_1$ in which we have a simple
additive model for $\theta^{(rs)}$. All actors in subgroup $G_r$ have a common $\alpha$, $\alpha^{(r)}$, and a
common $\beta$, $\beta^{(r)}$. We set

$$\theta^{(rs)} = \theta + \alpha^{(r)} + \beta^{(s)}, \qquad \rho^{(rs)} = \rho. \tag{2.10}$$

This model is equivalent to $p_1$ if $K = g$, and is a simplification, in the sense that we reduce
the number of $\alpha$'s (and $\beta$'s) from $g-1$ to $K-1$.


For details about these and other generalizations and specializations of $p_1$, and for comments
on fitting these subgroup models to single relational data, see Fienberg and Wasserman (1981a).
In Section 4 we give a multivariate generalization of this model.


3.  ANALYSIS  OF  A  SINGLE  RELATION  IN  A  CORPORATE  NETWORK 


To illustrate these models, and to present some additional methods, we consider 1976 data on
a network of the twenty-five largest publicly-owned corporations headquartered in the Twin
Cities of Minneapolis and St. Paul. A firm is included in the network if it is among Fortune
magazine's 500 largest industrials, 50 largest commercial banks, 50 largest life insurance
companies, 50 largest financial companies, 50 largest retailers, 50 largest transportation
companies, or 50 largest utilities. These companies are listed in Table 2, along with their
ranks and location.

A  preliminary  analysis  and  thorough  discussion  of  this  network  is  given  by  Galaskiewicz  and 
Wasserman  (1981).  An  arc  (or  a  "corporate  interlock")  exists  from  firm  i  to  firm  j  if  an 
officer  of  firm  j  is  on  the  corporate  board  of  directors  of  firm  i.  An  interesting  feature  of 
this  network  is  the  exclusion  of  dyadic  interactions  in  which  the  two  firms  of  the  dyad  have 
the  same  Standard  Industrial  Code.  These  "competitive"  dyads  have  been  excluded  because  of 
SEC  anti-trust  regulations  that  prevent  interlocks  between  firms  in  the  same  industry.  There 
are  27  of  these  "structurally  zero"  dyads. 

A variety of models was fitted to two versions of this network. One version included all 25
firms, and the other included only 20 firms, excluding four firms that do not interact with the
others (have zero in-degrees and out-degrees), namely American Hoist and Derrick, IDS,
Gamble-Skogmo, and North Central Airlines, and one further firm, Land O'Lakes, which is a
cooperative, and hence not strictly publicly owned. The calculation of degrees of freedom (df)
is tricky because of the structural zeros and the zero in-degrees and out-degrees. In general we
follow an approach similar to that suggested by Bishop, Fienberg, and Holland (1975, pp. 115-116).
Below, we report likelihood ratio ($G^2$) statistics and degrees of freedom for just 2 models.


                                              Fortune Rank
Firm                                          (1976)          City

Manufacturers
  Minnesota Mining & Manufacturing (3M)         56            St. Paul
  Honeywell                                     67            Minneapolis
  General Mills                                 84            Minneapolis
  Control Data                                 170            Minneapolis
  Pillsbury                                    173            Minneapolis
  Land O'Lakes                                 180            Minneapolis
  International Multifoods                     233            Minneapolis
  Bemis                                        318            Minneapolis
  Peavy                                        361            Minneapolis
  Heorner-Waldorf                              382            St. Paul
  American Hoist and Derrick                   434            St. Paul
  Economics Laboratory                         500            St. Paul

Commercial Banks
  Northwest Bankcorporation                     18            Minneapolis
  First Bank System                             20            Minneapolis

Life Insurance Companies
  Minnesota Mutual Life Insurance               41            St. Paul
  Northwestern National Life Insurance          42            Minneapolis

Diversified Financial Companies
  St. Paul Companies                            20            St. Paul
  Investors Diversified Services (IDS)          28            Minneapolis

Retailing Companies
  Dayton Hudson                                 20            Minneapolis
  Gamble-Skogmo                                 22            Minneapolis

Transportation Companies
  Burlington Northern R.R.                      10            St. Paul
  Northwest Orient Airlines                     18            St. Paul
  North Central Airlines                        48            Minneapolis
  Soo Line R.R.                                 49            Minneapolis

Utilities
  Northern States Power                         28            Minneapolis

Table  2.  Twin  Cities  Corporate  Network
                                                g = 25              g = 20
Model                                         G^2      df         G^2      df

p_1 (rho, theta, {alpha_i}, {beta_j})        186.69    192       182.89    176

theta (rho = alpha_i = beta_j = 0)           324.66    545       276.46    341


As can be seen, the very simple model with a single parameter provides an adequate
description of both versions. This implies that the actors in neither version exhibit differential
productivity or attractiveness, and that there is no tendency toward reciprocity. We conclude
that the elements in $X$ are independent identically-distributed Bernoulli random variables with
$p = P(X_{ij} = 1)$ and log odds $\theta = \log\{p/(1-p)\}$. Maximum likelihood estimates (MLEs) of
$\theta$ are -2.49 (g = 25) and -2.01 (g = 20). This yields $\hat{p}$ = 0.0906 (g = 25) and
$\hat{p}$ = 0.1553 (g = 20).

We  now  discuss  some  additional  and  new  methods  for  the  analysis  of  single  relational  data. 
We first describe tests for the adequacy of partitions of actors into subgroups, and then show
how to estimate main effects for and interactions between the discrete variables used to
partition  the  actors.  These  ideas,  along  with  the  methods  described  in  Holland  and  Leinhardt 
(1981)  and  Fienberg  and  Wasserman  (1981a),  should  provide  a  more  complete  "package"  for 
single  relational  data.  We  intend  the  remainder  of  this  section  to  fill  the  existing  gaps  in  this 
methodology,  and  will  use  the  1976  Twin  Cities  corporate  network  simply  for  illustrative 
purposes. 

Suppose we have two possible mutually exclusive and exhaustive partitions of a set of $g$
actors, $G = (G_1, G_2, \ldots, G_K)$ and $H = (H_1, H_2, \ldots, H_L)$, such that $K < L$, and the
$G$'s are unions of the $H$'s. For example, let $g = 6$, and define $G_1 = \{1,2,3\}$,
$G_2 = \{4,5,6\}$, and $H_1 = \{1,2\}$, $H_2 = \{3\}$, and $H_3 = \{4,5,6\}$; then,
$G_1 = H_1 \cup H_2$ and $G_2 = H_3$. Thus, $G$ is an aggregation of $H$.
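The nesting condition can be checked mechanically. The sketch below uses the six-actor example from the text; the function name `is_aggregation` is hypothetical, and both arguments are assumed to be valid partitions (mutually exclusive blocks).

```python
def is_aggregation(coarse, fine):
    """True if every block of `coarse` is a union of blocks of `fine`."""
    coarse = [frozenset(block) for block in coarse]
    fine = [frozenset(block) for block in fine]
    # Every fine block must sit inside exactly one coarse block ...
    nested = all(sum(h <= c for c in coarse) == 1 for h in fine)
    # ... and both partitions must cover the same set of actors.
    same_actors = set().union(*coarse) == set().union(*fine)
    return nested and same_actors

G = [{1, 2, 3}, {4, 5, 6}]
H = [{1, 2}, {3}, {4, 5, 6}]

g_aggregates_h = is_aggregation(G, H)
h_aggregates_g = is_aggregation(H, G)
```

The relation is not symmetric: the coarser partition aggregates the finer one, but not the reverse.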

We consider whether or not to further aggregate the actors into $K$ subgroups, assuming that
the actors are already partitioned into $L$ subgroups; i.e., can we combine some of the $L$
existing subgroups to form $K$ larger ones? Note that if $L = g$, then we ask whether or not
we should do any aggregation at all. We test

$H_0$ :  $p_1$ applied to $K$ subgroups is appropriate

versus

$H_A$ :  $p_1$ applied to $L$ subgroups is appropriate.

The version of $p_1$ applied to subgroups is given by equations (2.6) and (2.7). In terms of the
model parameters, there are $L-1$ each of the $\alpha^{(r)}$ and $\beta^{(r)}$ effects under $H_A$ and
$K-1$ each under $H_0$. The $\alpha$'s and $\beta$'s for the subgroups that are aggregated under $H_0$
are equated. Since $H_0$ is a special case of $H_A$, if we assume that the model under $H_A$ is
correct, then the conditional likelihood ratio statistic $G^2(H_0 \mid H_A) = G^2(H_0) - G^2(H_A)$,
with $[g(g-1) - 2K] - [g(g-1) - 2L] = 2(L-K)$ degrees of freedom, can be used to test $H_0$
versus $H_A$. If $L = g$, then the test statistic has $2(g - K)$ degrees of freedom.


For the 1976 Twin Cities corporate network, we focus on three partitions using the
information in Table 2:

$G_1 = (G_{11} = \text{Mpls. firms},\ G_{21} = \text{St. Paul firms})$

$G_2 = (G_{12} = \text{Large firms},\ G_{22} = \text{Small firms})$

$H = (H_1 = \text{Large Mpls. firms},\ H_2 = \text{Large St. Paul firms},\ H_3 = \text{Small Mpls. firms},\ H_4 = \text{Small St. Paul firms})$

"Size" of a firm is determined by the Fortune ratings: "large" firms rank in the top 250 of the
500 industrials (or in the top 25 of a list of 50). The 25 × 25 × 2 × 2 $y$ array, aggregated to a
4 × 4 × 2 × 2 array to reflect the $H$ partition, is given as Table 3. Note that both $G_1$ and $G_2$
are aggregations of $H$.

The  following  hierarchy  lists  the  three  aggregations  and  gives  the  associated  likelihood  and 
conditional  likelihood  ratio  statistics  for  testing  the  significance  of  aggregations: 
                                  H1: LARGE      H2: LARGE      H3: SMALL      H4: SMALL
                                  Minneapolis    St. Paul       Minneapolis    St. Paul

H1: LARGE   Minneapolis            54    7        31    0        53    5        35    2
                                    7    8         4    3         4    1         1    0

H2: LARGE   St. Paul               31    4         6    3        25    0        11    3
                                    0    3         3    0         0    0         0    0

H3: SMALL   Minneapolis            53    4        25    0        38    2        25    0
                                    5    1         0    0         2    0         0    0

H4: SMALL   St. Paul               35    1        11    0        25    0         8    1
                                    2    0         3    0         0    0         1    0

Table  3.  1976  Twin  Cities  Corporate  Network  Relations  aggregated
into  4  subgroups  based  on  Location  and  Size
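The aggregated array in Table 3 can be checked for internal consistency. The sketch below enters the sixteen 2 × 2 tables as read from Table 3 and assumes, as a labeling convention not stated in the table itself, that the first index of each 2 × 2 table is the state of the row-block actor's choice and the second that of the column-block actor's choice. The array name `t` is illustrative.

```python
import numpy as np

# t[r, s] is the 2 x 2 table for row block H_{r+1} and column block H_{s+1}.
t = np.array([
    [[[54, 7], [7, 8]], [[31, 0], [4, 3]], [[53, 5], [4, 1]], [[35, 2], [1, 0]]],
    [[[31, 4], [0, 3]], [[6, 3], [3, 0]], [[25, 0], [0, 0]], [[11, 3], [0, 0]]],
    [[[53, 4], [5, 1]], [[25, 0], [0, 0]], [[38, 2], [2, 0]], [[25, 0], [0, 0]]],
    [[[35, 1], [2, 0]], [[11, 0], [3, 0]], [[25, 0], [0, 0]], [[8, 1], [1, 0]]],
])

# The redundancy y_ijkl = y_jilk carries over to the aggregated array.
is_redundant = np.array_equal(t, t.transpose(1, 0, 3, 2))

ordered_pairs = int(t.sum())                     # non-structural-zero ordered pairs
structural_zero_dyads = (25 * 24 - ordered_pairs) // 2
mutual_dyads = int(t[:, :, 1, 1].sum()) // 2     # each mutual dyad is counted twice
```

The implied number of excluded dyads agrees with the 27 "structurally zero" dyads mentioned in the description of the network.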


Aggregation                                         G^2       df     Delta G^2   Delta df

p_1 - no aggregation; 25 actors                    186.69     192

H   - aggregation by size & location; L = 4        401.80     538     215.11       346

G_1 - aggregation by location; K_1 = 2             461.62     542      59.82         4

G_2 - aggregation by size; K_2 = 2                 525.63     542     124.01         4


Note that $G^2(H \mid p_1)$ = 401.80 - 181.54 = 220.26 is less than the corresponding difference in
df, 346, so that one could argue that aggregating the 25 actors into 4 subgroups is not
necessary. The statistic $G^2(H)$ is clearly small, however, and the simplicity of the $H$
aggregation is so desirable that it is a very attractive model. Both statistics
$G^2(G_1 \mid H)$ = 59.82 and $G^2(G_2 \mid H)$ = 124.01 yield p-values less than $10^{-4}$, so further
aggregation is not advisable.
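The p-values for the conditional tests can be reproduced without a statistics library, because the degrees of freedom $2(L-K)$ are always even and the chi-square upper tail then has a closed form. The sketch below applies it to the two conditional statistics reported above; the function names are hypothetical, and the use of the chi-square reference distribution is the usual large-sample approximation rather than an exact result.

```python
import math

def chi2_sf_even_df(stat, df):
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2m: P(X > x) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = stat / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def aggregation_df(L, K):
    """Degrees of freedom for testing K subgroups against L subgroups."""
    return 2 * (L - K)

df = aggregation_df(L=4, K=2)
p_location = chi2_sf_even_df(59.82, df)    # G^2(G_1 | H) as reported
p_size = chi2_sf_even_df(124.01, df)       # G^2(G_2 | H) as reported
```

Both tail probabilities fall far below $10^{-4}$, in line with the conclusion that neither two-group aggregation is supported.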


There is one substantial advantage in using aggregated versions of these models. Besides the
ease with which the maximum likelihood cell estimates can be computed (we need only a
K × K × 2 × 2 table, where K is usually quite a bit smaller than g), the standard $\chi^2$
distributions are more appropriate as reference distributions for the resulting test statistics.
This is because the number of parameters (2K with the $p_1$-subgroup model) is fixed and does
not increase in the limit, as $g \to \infty$. There are problems that arise in testing when using
models with parameters for each actor (see Haberman, 1981). Fortunately, these problems are
attenuated when actors are aggregated.

In  the  following  section  we  generalize  this  approach  to  the  case  of  multiple  relations. 


4.  MODELS  FOR  MULTIPLE  RELATION  DATA 

We now turn our attention to networks of actors on which several relations are defined.
We  discuss  three  types  of  models:  (1)  Models  with  neither  actor  nor  group  parameters;  (2) 
models  with  only  group  parameters;  and  (3)  models  with  both  actor  and  group  parameters. 
The  first  type  is  a  family  of  models  for  the  macroanalysis  of  the  multiple  relations  that 
ignores any differences between actors. These models are briefly described in Fienberg, Meyer,
and Wasserman (1981), and were used implicitly by Galaskiewicz and Marsden (1978) to study
resource  flows  between  organizations  in  a  midwestern  community. 

The  most  useful  models  for  multiple  relations  are  those  that  include  parameters  to  reflect 
different  choice  tendencies  of  the  actors,  particularly  when  they  have  been  partitioned  into 
groups.  If  each  group  is  a  singleton,  then  we  have  a  different  set  of  parameters  for  each 
actor;  however,  in  practice  this  is  likely  to  be  a  very  large  number.  Thus,  the  assumption  of  a 
specific  partition,  chosen  as  a  consequence  of  extra-relational  information,  allows  us  to 
parsimoniously limit the number of parameters, and (as is the case with single relational data)
use standard $\chi^2$ asymptotic distributions for testing.

The  last  type  of  model  is  a  generalization  of  the  family  of  models  for  multiple  relational 
data  sets  in  which  the  actors  have  been  partitioned  into  mutually  exclusive  and  exhaustive 
groups. The assumption that all actors in a specific group relate to actors in other groups and
to other actors in the same group in identical ways may not always hold. There may
be subtle individual differences among the actors in a subgroup. Thus, the third type of
model adds individual actor parameters to the second type of model, which has just group
parameters, so that these differences can be studied.

We  conclude  this  paper  by  illustrating  these  models  on  Sampson’s  network  of  18  monks,  for 
which  we  have  4  positive  relations  and  4  negative  relations,  and  three  subgroups,  empirically 
determined  by  the  use  of  clustering  algorithms. 
4.1  Models  for  the  Macroanalysis  of  Multiple  Relations


In order to model the macro-aspects of multigraphs we need to develop a notation for the
$2^{2R}$ possible realizations of the $\{D_{ij}\}$ and a representation for the table of summary counts
of these realizations obtained by adding across dyads. Since these models assume no individual
actor differences, the sufficient statistics for the model parameters are margins of this table.


Table 4 contains summaries of Sampson's data, shown in Table 1, in the form of two $2^8$
tables of counts of pairs of monks, one for the four positive relations and the other for the
four negative relations. Within each table, each pair is counted twice, once from the
perspective of each member, yielding a total count of $2 \times \binom{18}{2} = 306$. We refer to these
tables  as  w-arrays,  with  entries  {w. . . .}.  Here  R  =  4. 
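
The construction of a w-array can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `w_array` and `mirror`, the ordering of the entries within each cell key, and the toy sociomatrices are hypothetical and are not taken from Sampson’s data.

```python
from collections import Counter
from itertools import permutations


def mirror(pattern):
    """Swap the (i->j, j->i) entries within each relation."""
    swapped = []
    for r in range(0, len(pattern), 2):
        swapped.extend((pattern[r + 1], pattern[r]))
    return tuple(swapped)


def w_array(sociomatrices):
    """Count ordered pairs (i, j), i != j, by their dyadic pattern.

    Each cell key is the tuple
    (x_1[i][j], x_1[j][i], ..., x_R[i][j], x_R[j][i]),
    an ordering assumed here for illustration only.
    """
    g = len(sociomatrices[0])
    counts = Counter()
    for i, j in permutations(range(g), 2):
        pattern = []
        for x in sociomatrices:
            pattern.extend((x[i][j], x[j][i]))
        counts[tuple(pattern)] += 1
    return counts


# Hypothetical toy data: g = 3 actors, R = 2 relations.
relation_1 = [[0, 1, 0],
              [1, 0, 0],
              [0, 1, 0]]
relation_2 = [[0, 1, 1],
              [0, 0, 0],
              [0, 1, 0]]

w = w_array([relation_1, relation_2])
total = sum(w.values())  # every dyad is counted twice: g * (g - 1)
print(total)
```

Because each dyad is visited once from each member's perspective, every cell has the same count as its mirror image, which is the doubling and duplication referred to in the text.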

There are other ways to arrange these summary counts in tabular form. One way eliminates
the cells which occur twice. In general, a $2^{2R}$ w-array contains $2^{R-1}(2^R + 1)$ unique
cells. Among these are $2^R$ cells whose counts are duplicated; i.e., occur twice in w. If we
eliminate the doubling and duplication in the 8-dimensional w-arrays given in Table 4, we get
two arrangements of 136 cells, whose counts correctly total 153. In Table 5, we give one
possible arrangement of these 136 cells in a form resembling a four-dimensional
$3 \times 3 \times 3 \times 3$ cross-classification, in which some of the 81 cells have more
than one count. We denote the counts in Table 5 by
$z = \{z_{abcd} : a,b,c,d = M, A, \bar{A}, N\}$ (the use of the subscripts $A$ and $\bar{A}$ is
described in the caption to the table).
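
The cell counts quoted above can be checked numerically. The sketch below compares the closed form $2^{R-1}(2^R + 1)$ with a brute-force enumeration of dyadic patterns that treats a pattern and its mirror image as one cell; the function names are illustrative assumptions.

```python
from itertools import product
from math import comb


def unique_cells(R):
    """Closed form quoted in the text: 2^(R-1) * (2^R + 1)."""
    return 2 ** (R - 1) * (2 ** R + 1)


def unique_cells_by_enumeration(R):
    """Count binary patterns of length 2R up to swapping the two dyad members."""
    seen = set()
    for pattern in product((0, 1), repeat=2 * R):
        # index r ^ 1 exchanges the i->j and j->i entries of each relation
        swapped = tuple(pattern[r ^ 1] for r in range(2 * R))
        seen.add(min(pattern, swapped))
    return len(seen)


for R in (2, 3, 4):
    print(R, 2 ** (2 * R), unique_cells(R), unique_cells_by_enumeration(R))

g = 18
print(comb(g, 2), 2 * comb(g, 2))  # number of dyads, and the doubled total
```

For R = 4 both counts give 136 cells, and with g = 18 actors there are 153 dyads, or 306 when each is counted twice.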

We wish to model $p_{abcd}$, the probability that a randomly selected dyad would be assigned
to cell $(a,b,c,d)$ of Table 5, where

$$\sum_{\text{all cells}} p_{abcd} = 1. \qquad (4.1)$$


Table 4


Sampson’s Cloister Data Aggregated Over Actors

(a) the order of the variables is (like, esteem, influence, praise)
with the index on the first variable changing fastest


180  6626  21061  2000

41003  80100  0000
00000  00000  0000
00000  00001  0000
(b) the order of the variables is (antagonism, disesteem, negative influence, blame)
with the index on the first variable changing fastest
We define

$$\xi_{abcd} = \begin{cases} \log p_{abcd}, & \text{if } a, b, c, \text{ and } d \text{ are each equal to either } M \text{ or } N, \\ \log (p_{abcd}/2), & \text{if at least one of } a, b, c, \text{ or } d \text{ is asymmetric } (A \text{ or } \bar{A}), \end{cases} \qquad (4.2)$$

and we develop a class of linear models for the $\{\xi_{abcd}\}$ which yields an affine
translation of a class of loglinear models for the $\{p_{abcd}\}$. The reasons for this approach
are discussed by Fienberg, Meyer, and Wasserman (1981); primarily, we introduce the factor of
1/2 for $A$ cells to make our models consistent with the univariate model of Holland and
Leinhardt (1981).


The models for the $\{\xi_{abcd}\}$ are linear in sets of parameters that reflect the various
distinct types of dyadic patterns. In Fienberg, Meyer, and Wasserman (1981) we considered R = 3
relations as displayed in Figure 1. The $\{\xi_{abc}\}$ were modeled with up to 36 (one per cell
in z) parameters with hierarchical structure reflecting 13 distinct types. When R = 2, there are
only 7 distinct types and at most 10 parameters are necessary. When R = 4, there are 22 distinct
types and 81 parameters.

FIGURE  1  PATTERNS  OF  FLOW  DEPENDENCY  IN  DYADIC  PATTERNS 
Table 5


Nonredundant Arrangement of Cells for Positive Relations from Sampson’s Data

(see Table 4a)

[Table body not recoverable: cells $z_{abcd}$ arranged by Praise and Influence, each with levels M, A, and N.]

The parameters in this family of models are GLIM-like in structure (see Nelder and
Wedderburn, 1972). A parameter is included in the model if and only if the corresponding
effect (such as choice, conditional multiplexity, etc.) is present. The parameters are also
hierarchical: if we set some parameters equal to zero, all related higher-order terms are also
zero.

To  fit  these  models  to  multivariate  networks,  we  apply  the  general  results  for  fitting 
loglinear  models  given  in  Haberman  (1974)  or  Appendix  II  of  Fienberg  (1980).  The  minimal 
sufficient  statistics  (MSS’s)  are  linear  combinations  of  the  elements  of  the  z  array,  with 
coefficients  of  0,  1,  or  2.  The  fitted  values  of  these  elements  are  found  by  solving  the 
likelihood  equations,  which  set  the  MSS’s  equal  to  their  estimated  expected  values.  We  can 
either  use  a  version  of  generalized  iterative  proportional  fitting  due  to  Darroch  and  Ratcliff 
(1972), or a "trick," given in Fienberg, Meyer, and Wasserman (1981), which relies on the
following  two  results: 

Result 1:  For the class of affine translations of hierarchical
           loglinear models described above, each set of MSS’s is
           equivalent to a set of marginal totals for the $2^{2R}$ table
           (i.e., the w-table) with doubled and duplicated counts.

Result 2:  For each affine translation of a loglinear model for the
           z-table, there is a corresponding loglinear model for the
           w-table, with equivalent estimated expected values,
           once we take account of the duplication and doubling.

The  estimated  expected  values  for  the  elements  of  the  w-array  can  be  computed  using  the 
standard  IPFP  and  the  estimates  of  the  parameters  calculated  from  the  fitted  values.  We  note 
that  the  degrees  of  freedom  for  any  model  must  be  calculated  using  the  model  for  the  z- 
array,  and  values  of  goodness-of-fit  statistics  computed  using  the  w-array,  dividing  by  2  to 
adjust  for  the  doubling  and  duplications. 
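
As a rough illustration of this recipe, the sketch below runs the standard IPFP on a small doubled table and halves the likelihood-ratio statistic. The w-array shown is hypothetical (one relation among seven actors), the fitted model (independence of the two directions of choice) is only a stand-in, and the names `margin`, `ipfp`, and `halved_g2` are assumptions made for this example. Degrees of freedom are deliberately not computed here, since they must come from the model for the z-array.

```python
from itertools import product
from math import log


def margin(table, axes):
    """Sum a dict-based binary table over all axes not listed in `axes`."""
    totals = {}
    for cell, value in table.items():
        key = tuple(cell[a] for a in axes)
        totals[key] = totals.get(key, 0.0) + value
    return totals


def ipfp(observed, margins, n_axes, cycles=25):
    """Standard IPFP: rescale the fitted table to each observed margin in turn."""
    fitted = {cell: 1.0 for cell in product((0, 1), repeat=n_axes)}
    for _ in range(cycles):
        for axes in margins:
            target = margin(observed, axes)
            current = margin(fitted, axes)
            for cell in fitted:
                key = tuple(cell[a] for a in axes)
                if current[key] > 0:
                    fitted[cell] *= target.get(key, 0.0) / current[key]
    return fitted


def halved_g2(observed, fitted):
    """Likelihood-ratio statistic on the w-array, divided by 2 for the doubling."""
    g2 = 2.0 * sum(o * log(o / fitted[c]) for c, o in observed.items() if o > 0)
    return g2 / 2.0


# Hypothetical doubled counts: 10 null dyads, 4 mutual dyads, 7 asymmetric dyads.
w = {(0, 0): 20, (1, 1): 8, (1, 0): 7, (0, 1): 7}
fitted = ipfp(w, margins=[(0,), (1,)], n_axes=2)
print(round(fitted[(1, 1)], 3), round(halved_g2(w, fitted), 3))
```

The fitted table inherits the symmetry of the observed w-array, so halving the statistic removes the contribution of the duplicated cells.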


4.2  Models  for  Both  Microanalysis  and  Macroanalysis:  Actor  and  Group  Effects 

We now consider models for multiple relations that allow the actors in the network to
engage in relations at possibly different rates, and include both actor and group effects. To
review, we suppose that the R sociometric relations defined for a group of g actors are
binary, and the presence/absence of directed links between actors is recorded in the form of
R sociomatrices. As before, we concentrate on the dyadic relationships between the
$\binom{g}{2}$ pairs of actors i and j, represented by the 2R-variate $D_{ij}$, with
realization $d_{ij}$.

Primarily to limit the number of parameters, we now assume that the actors have been
partitioned into K mutually exclusive and exhaustive subgroups, $G_1, G_2, \ldots, G_K$. In
practice, it is very useful to allow for the inherent differences in the actors in this manner.
If there are single actors that behave contrary to the group as a whole (or to the collection
of subgroups), then they can be placed into their own singleton subgroups. Thus, their
individual differences can still be modeled directly.

In this section we outline models which can include both actor and group effects. These
models contain all the previous models as special cases. The R sociomatrices are used to
construct a table of pseudo-counts, of size $g \times g \times (2 \times 2)^R$. From this
multivariate version of the x-array, we can aggregate $(2 \times 2)^R$ tables to form a
$K \times K \times (2 \times 2)^R$ table, whose entries are the frequencies of the different
dyadic relationship patterns between actors of a group partitioned into K subgroups. As in the
earlier cases it is most convenient to work with the full $g \times g \times (2 \times 2)^R$
data table but to describe models in terms of the unduplicated data array. This approach also
grants us a considerable degree of flexibility in fitting the models. For many of the models it
is possible to consider collapsed or aggregated versions of the data which would result in
smaller data tables. We believe that the unification which is introduced by always considering
the full data table outweighs the occasional advantage of having a smaller table.
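
A minimal sketch of the two tables just described follows, assuming a sparse dictionary representation. The helper names (`dyad_pattern`, `full_array`, `aggregate`), the key layout, and the toy network with its partition are all hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import permutations


def dyad_pattern(sociomatrices, i, j):
    """Pattern (x_1[i][j], x_1[j][i], ..., x_R[i][j], x_R[j][i]) for the pair (i, j)."""
    pattern = []
    for x in sociomatrices:
        pattern.extend((x[i][j], x[j][i]))
    return tuple(pattern)


def full_array(sociomatrices):
    """g x g x (2 x 2)^R array of pseudo-counts, stored sparsely.

    Each ordered pair (i, j), i != j, contributes a single 1.
    """
    g = len(sociomatrices[0])
    return {(i, j) + dyad_pattern(sociomatrices, i, j): 1
            for i, j in permutations(range(g), 2)}


def aggregate(array, group_of):
    """Collapse actors into subgroups: a K x K x (2 x 2)^R table of frequencies."""
    table = Counter()
    for (i, j, *pattern), count in array.items():
        table[(group_of[i], group_of[j]) + tuple(pattern)] += count
    return table


# Hypothetical toy network: g = 4 actors, R = 1 relation, K = 2 subgroups.
x = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
group_of = [0, 0, 1, 1]

full = full_array([x])
grouped = aggregate(full, group_of)
print(len(full), sum(grouped.values()))
```

Working from the full array and aggregating afterwards mirrors the preference stated above for always keeping the full data table and collapsing only when a model allows it.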


We will begin our discussion by concentrating on the choice parameters in an R = 3 relation
network. As a starting point we contemplate the model

$$\log P(D_{ij} = d_{ij}) = \lambda^{(ij)} + \sum_{r=1}^{R} \theta_r^{(ij)} d_{ijr} + \sum_{r=1}^{R} \theta_r^{(ji)} d_{jir}, \quad \text{for all } i > j, \qquad (4.3)$$

which includes different choice parameters for each pair of individuals. The parameters
$\lambda^{(ij)}$ are normalizing constants and are required so as to meet the sampling
constraints of the problem.


Initially, we focus our attention on just the first relation. If we wished to consider a model
which asserted that the response depended only on the chooser we would allow
$\theta^{(ij)} = \theta^{(i)}$. Similarly, dependence only on the chosen actor would lead to
$\theta^{(ij)} = \theta^{(j)}$. Obviously we could allow chooser and chosen actor effects (but
excluding the interaction) by specifying $\theta^{(ij)} = \theta^{(i)} + \theta^{(j)}$. Another
version of this model would be to suppose that individual actors assert influence only through
the groups to which they belong. In this case we could write
$\theta^{(ij)} = \theta^{(\gamma(i))}$ for the choosing group,
$\theta^{(ij)} = \theta^{(\gamma(j))}$ for the chosen group,
$\theta^{(ij)} = \theta^{(\gamma(i))} + \theta^{(\gamma(j))}$ for both, or even
$\theta^{(ij)} = \theta^{(\gamma(i),\gamma(j))}$ to indicate group by group choice interactions,
where $\gamma(i)$ denotes the subgroup containing actor i. If we aggregate over groups then
these models are just "actor" models where the actors are the groups. In summary we have just
described four basic classes of models: those which involve parameters $\{\theta^{(ij)}\}$ for
each pair of actors, those with individual parameters ($\{\theta^{(i)}\}$ and
$\{\theta^{(j)}\}$) for actors, and the corresponding notions,
$\{\theta^{(\gamma(i),\gamma(j))}\}$ and ($\{\theta^{(\gamma(i))}\}$ and
$\{\theta^{(\gamma(j))}\}$), for groups. Thus some possible "choice-only" models for one
relation are:


$\log P(D_{ij} = d_{ij}) =$

    1)  $\theta$                                           constant
    2)  $\theta^{(i)}$                                     chooser
    3)  $\theta^{(j)}$                                     chosen
    4)  $\theta^{(i)} + \theta^{(j)}$                      chooser and chosen
    5)  $\theta^{(ij)}$                                    interaction
    6)  $\theta^{(\gamma(i))}$                             group chooser
    7)  $\theta^{(\gamma(j))}$                             group chosen
    8)  $\theta^{(\gamma(i))} + \theta^{(\gamma(j))}$      group chooser and chosen
    9)  $\theta^{(\gamma(i),\gamma(j))}$                   group interaction.

It is possible to mix and match among these models to consider, for example, the model
$\theta^{(i)} + \theta^{(j)} + \theta^{(\gamma(i),\gamma(j))}$, which allows individual actor
parameters and a group interaction. As soon as we contemplate such models we need to note that
there is a partial hierarchy to the models listed above, which we represent in Figure 2.


Figure 2. Hierarchy Displaying Levels of Parameters in the Loglinear Models


The  diagram  indicates  that  any  parameter  at  level  i  implies  all  those  parameters  at  levels  less 
than  i.  The  modelling  strategy  we  have  outlined  above  can  be  used  for  other  types  of  flows 
(e.g.  mutual,  reciprocal)  and  multiple  relationships.  In  these  cases  we  need  to  be  concerned 
about  the  hierarchical  structure  between  parameter  types  as  well  as  within  parameter  types. 
4.3 Fitting Models

If we restrict ourselves to actor parameters then we can fit the models described above using
the IPFP to adjust simple margins of the symmetric $g \times g \times (2 \times 2)^R$ data
array. When models with group parameters are included it is still possible to use the IPFP but
now a more general notion of margin is needed.

Let us consider two relations and the model
$\log P(D_{ij} = d_{ij}) = \lambda^{(ij)} + \theta_1^{(i)}$, i.e. a choice parameter on only
the first relationship. The sufficient statistics for this model in the symmetric data array
are

    [12]   $\sum_{k,l,m,n} x_{ijklmn}$   for all i, j,
    [13]   $\sum_{j,l,m,n} x_{ijklmn}$   for all i, k,
    [24]   $\sum_{i,k,m,n} x_{ijklmn}$   for all j, l.

Now consider the model $\lambda^{(ij)} + \theta_1^{(\gamma(i))}$. For this model the
sufficient summary is

    $\sum_{k,l,m,n} x_{ijklmn}$                      for all i, j,
    $\sum_{i \in G_d} \sum_{j,l,m,n} x_{ijklmn}$     for d = 1, ..., K and for all k,
    $\sum_{j \in G_d} \sum_{i,k,m,n} x_{ijklmn}$     for d = 1, ..., K and for all l.

We extend the usual square bracket notation to this situation. Recall that [12] indicates that
for each value of i and j we should sum over all other dimensions in the table. We shall use
the notation [-1 -2] to indicate that for each $G_d$ and $G_e$ we should sum over all entries
in the table whose first index lies in $G_d$ and whose second index lies in $G_e$. A simple
example should help to explain the notation.


Consider the following $3 \times 3$ table:

         1    2    3
    1    a    b    c
    2    d    e    f
    3    g    h    i

Let $G_1 = \{1, 2\}$ and $G_2 = \{3\}$. Then the [1] margin is the triple (a+b+c, d+e+f,
g+h+i), the [-1] margin is the pair (a+b+c+d+e+f, g+h+i) and the [-1 -2] margin is the table

    a+b+d+e    c+f
    g+h        i

In  effect  we  have  collapsed  over  the  groups.  It  is  an  easy  application  of  the  IPFP  to  fit 
models  which  use  this  generalized  notion  of  margin.  We  note,  however,  that  most  standard 
packages,  which  contain  an  IPFP  routine,  cannot  be  cajoled  into  fitting  such  models  without 
some  tinkering. 
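
The generalized margins in the $3 \times 3$ example, and a single IPFP adjustment to a [-1 -2] margin, can be sketched as follows. The entries a through i are replaced by the numbers 1 through 9 purely for illustration, groups are written with zero-based indices, and the function names are assumptions of this sketch rather than routines from any standard package.

```python
def margin_1(table):
    """[1] margin: one total per row."""
    return [sum(row) for row in table]


def margin_group_1(table, row_groups):
    """[-1] margin: one total per group of rows."""
    return [sum(sum(table[r]) for r in group) for group in row_groups]


def margin_group_12(table, row_groups, col_groups):
    """[-1 -2] margin: one total per (row group, column group) block."""
    return [[sum(table[r][c] for r in rg for c in cg) for cg in col_groups]
            for rg in row_groups]


def ipfp_step(fitted, target, row_groups, col_groups):
    """One IPFP adjustment of `fitted` to a target [-1 -2] margin."""
    current = margin_group_12(fitted, row_groups, col_groups)
    adjusted = [row[:] for row in fitted]
    for a, rg in enumerate(row_groups):
        for b, cg in enumerate(col_groups):
            scale = target[a][b] / current[a][b] if current[a][b] > 0 else 0.0
            for r in rg:
                for c in cg:
                    adjusted[r][c] = fitted[r][c] * scale
    return adjusted


# a..i replaced by 1..9; G_1 = {1, 2} and G_2 = {3} become zero-based index lists.
table = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
groups = [[0, 1], [2]]

print(margin_1(table))                          # (a+b+c, d+e+f, g+h+i)
print(margin_group_1(table, groups))            # (a+b+c+d+e+f, g+h+i)
target = margin_group_12(table, groups, groups)
print(target)                                   # [[a+b+d+e, c+f], [g+h, i]]

start = [[1.0] * 3 for _ in range(3)]
fitted = ipfp_step(start, target, groups, groups)
print(margin_group_12(fitted, groups, groups))
```

Each adjustment rescales a whole block of cells by a common factor, which is the only change needed to the usual IPFP cycle when a margin is defined over groups rather than over individual indices.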

We  now  use  the  notation  to  show  which  models  correspond  to  certain  parametrizations  for 
R  =  2.  Table  6  lists  some  of  the  choice  models,  and  a  small  selection  of  other  possible  models. 
There  are  many  possible  models  with  many  possible  combinations  of  population,  group,  and 
individual  parameters. 

4.4  An  Example:  Sampson’s  Data 

In  order  to  demonstrate  the  ubiquity  and  apparent  complexity  of  these  social  network 
models,  we  have  taken  a  somewhat  unusual  (and  bold)  approach  to  the  analysis  of  Sampson’s 
network  of  eighteen  monks.  We  view  the  four  positive  attributes  (like,  esteem,  influence,  and 
praise)  as  realizations  of  a  single  positive  affect  process,  and  the  four  negative  relations 
(antagonism,  disesteem,  negative  influence,  and  blame)  in  a  similar  manner.  There  is  substantial 
justification  for  this  pooling.  White,  Boorman,  and  Breiger  (1976)  found  that  when  the 
eighteen  actors  are  aggregated  into  three  blocks,  the  concrete  social  structure  of  this  network  is 
Table 6.  A Selection of Possible Models for R=2

              Parameters                                   Margins to be Fit

choice        $\theta_1$                                   [12]  [3]  [4]
              $\theta_1^{(i)}$                             [12]  [13]  [24]
              $\theta_1^{(j)}$                             [12]  [14]  [23]
              $\theta_1^{(i)} + \theta_1^{(j)}$            [12]  [13]  [14]  [23]  [24]
              $\theta_1^{(\gamma(i))}$                     [12]  [-13]  [-24]
              $\theta_1^{(\gamma(j))}$                     [12]  [-14]  [-23]
              $\theta_1^{(\gamma(i),\gamma(j))}$           [12]  [-1 -2 3]  [-1 -2 4]

mutuality     $\rho_{11}^{(i)}$                            [12]  [134]  [234]
              $\rho_{22}^{(\gamma(i))}$                    [12]  [-156]  [-256]
              $\rho_{11}^{(\gamma(i),\gamma(j))}$          [12]  [-1 -2 34]

multiplex     $\theta_{12}^{(i)}$                          [12]  [135]  [246]
              $\theta_{12}^{(\gamma(j))}$                  [12]  [-146]  [-235]
              $\theta_{12}^{(\gamma(i),\gamma(j))}$        [12]  [-1 -2 35]  [-1 -2 46]

reciprocity   $\rho_{12}^{(i)}$                            [12]  [136]  [245]
              $\rho_{12}^{(\gamma(j))}$                    [12]  [-145]  [-236]
              $\rho_{12}^{(\gamma(i),\gamma(j))}$          [12]  [-1 -2 36]  [-1 -2 45]

etc.

much the same across the four pairs of positive/negative relations: "A top-esteemed block
(consisting of 7 actors) unambivalently positive toward itself, in conflict with ... a second,
more ambivalent block (also of 7 actors) to which is attached a block of losers (of size 4)."
We label these blocks or subgroups as

$G_1 = \{1, 2, \ldots, 7\}$,  $G_2 = \{8, 9, \ldots, 14\}$,  $G_3 = \{15, 16, 17, 18\}$.

We  therefore  aggregate  over  both  sets  of  relations  by  summing  the  four  sociomatrices  for  the 
positive  relations,  and  the  four  negative  relations,  to  obtain  one  positive  and  one  negative 
relation  matrix.  These  arrays,  given  in  Table  7,  have  entries  indicating  the  number  of  times 
actor  i  chooses  actor  j,  either  on  the  positive  or  negative  choices. 
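
The pooling step amounts to an elementwise sum of sociomatrices. A minimal sketch, using four hypothetical $3 \times 3$ matrices rather than Sampson’s data and an illustrative function name:

```python
def aggregate_relations(sociomatrices):
    """Elementwise sum: entry (i, j) is the number of relations on which i chooses j."""
    g = len(sociomatrices[0])
    return [[sum(x[i][j] for x in sociomatrices) for j in range(g)]
            for i in range(g)]


# Hypothetical 0-1 sociomatrices standing in for four positive relations.
like = [[0, 1, 0], [1, 0, 0], [0, 1, 0]]
esteem = [[0, 1, 1], [0, 0, 0], [0, 1, 0]]
influence = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
praise = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]

positive = aggregate_relations([like, esteem, influence, praise])
for row in positive:
    print(row)
```

Each entry of the pooled matrix lies between 0 and the number of relations summed, here 4.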

The techniques we have used to analyze 0-1 sociomatrices are directly applicable here.
Furthermore, with multiple observations on each actor, the asymptotic basis for the
goodness-of-fit statistics stands on firmer ground. In our analysis we have examined the
$18 \times 18 \times (2 \times 2) \times (2 \times 2)$ (corresponding to actor × actor ×
positive × negative) version of this table and have used the three groups given above.

A priori, some choices are unlikely to be reciprocated across relations, and we should find
a simple choice or group choice model to be an adequate summarization of the flows of
attitudes, both positive and negative, across and between these three substantively different
subgroups. A summary of some of the models that we fit to this network is given in Table 8.

The difference in goodness of fit between models 2 and 3 (which in a sense is a measure of
the impact of the grouping effect) is statistically significant at any reasonable level of
significance and is typical of the improvement resulting from the addition of simple grouping
parameters. Similarly the difference in $G^2$ values for models 4 and 5 is also large but is
less than the difference in degrees of freedom. The small number of degrees of freedom for
these models is caused by a large number of fitted zeros. Indeed, any model which includes even
an overall multiplex ($\theta_{12}$) parameter induces at least 2142 fitted zeros out of the
$18 \times 18 \times 4 \times 4 = 5184$ cells in the table, and the goodness of fit statistics
are not dramatically improved
Table 7


Sampson’s Cloister Data Aggregated Over Relations


Aggregated positive


011010000000001000
002033000100000000
010020212100000000
013042000100000000
310404000000000000

Aggregated negative

Table 8

Summary of Fit of Several Models on Sampson’s Data


    Model                                                       Margins                                   d.f.     $G^2$    $X^2$

1.  $\lambda^{(ij)}$                                            [12]                                      2295     2835     6252

2.  $\lambda^{(ij)} + \theta_1 + \theta_2$                      [12] [3] [4] [5] [6]                      2291     1453     3392

3.  $\lambda^{(ij)} + \theta_1^{(\gamma(i))}                    [12] [-13] [-14] [-15] [-16]              2271     1395     3350
    + \theta_1^{(\gamma(j))} + \theta_2^{(\gamma(i))}           [-23] [-24] [-25] [-26]
    + \theta_2^{(\gamma(j))}$

4.  $\lambda^{(ij)} + \rho_{12}^{(\gamma(i))}                   [12] [-136] [-236]                        2259     1368     3088
    + \rho_{12}^{(\gamma(j))}$                                  [-145] [-245]

5.  $\lambda^{(ij)} + \rho_{12}^{(\gamma(i),\gamma(j))}$        [12] [-1 -2 36] [-1 -2 45]                <1800    1180     2158


by the inclusion of these parameters. We note the very large differences between the $G^2$
and $X^2$ values in Table 8, which go in the opposite direction from that suggested by the
argument given in Larntz (1978). The only explanation we can offer is the presence of the large
proportion of observed zero cells.
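
The comparison of models 2 and 3 can be made concrete with a small calculation. The sketch below refers the difference in $G^2$ to a $\chi^2$ distribution using a closed-form upper tail that holds only for even degrees of freedom; the function name is an assumption of this example, and the resulting tail probability should be read with the caution about sparse tables noted above.

```python
from math import exp


def chi2_sf_even_df(x, df):
    """Upper tail P(chi^2_df > x), valid for positive even df only.

    Uses exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return exp(-half) * total


# Values for models 2 and 3 as listed in Table 8.
delta_g2 = 1453 - 1395
delta_df = 2291 - 2271
p_value = chi2_sf_even_df(delta_g2, delta_df)
print(delta_g2, delta_df, p_value)
```

A difference of 58 on 20 degrees of freedom gives a very small tail probability, in line with the statement that the grouping effect is significant at any reasonable level.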

It appears that model 4, which includes different reciprocity effects for each group, provides
a reasonable description of the data.

A  more  thorough  analysis  of  the  data  for  this  network  should  include  a  detailed  study  of  the 
similarities  of  the  four  pairs  of  positive/negative  relations,  and  should  experiment  with  other, 
more  refined  partitions  of  the  actors,  as  suggested  by  Breiger,  Boorman,  and  Arabie  (1975). 
We  have  just  touched  the  surface  of  a  rather  large,  and  certainly  rich,  set  of  longitudinal  data. 
We  have  studied  the  monastery  structure  only  at  the  midpoint  of  a  12-month  period,  during 
which  a  crisis  over  theology  occurred,  and  the  group  split  up. 
5. CONCLUSION

In  this  paper,  we  have  considered  a  variety  of  loglinear  models  for  micro  and  macro  analysis 
of  binary  social  network  data,  and  we  have  demonstrated  how  these  models  can  be  treated  in  a 
unified  manner.  The  models  we  have  considered  describe  important  aspects  of  the  data,  and 
we  have  had  the  good  fortune  to  be  able  to  take  advantage  of  relatively  easy  estimation 
methods  for  model  fitting. 

Unfortunately,  large  data  sets  and  corresponding  large  models  are  almost  axiomatic  with  the 
type  of  data  we  have  described  here.  Our  modelling  has  been  consciously  and  unconsciously 
influenced  by  what  it  is  possible  for  us  to  compute.  The  models  with  separate  group  effects 
seem  to  be  at  the  limits  of  the  computational  methodology  we  have  presented.  Other  models 
which  could  be  considered  interesting  (e.g.  additional  relationships  between  the  groups,  akin  to 
ordered  category  models  for  contingency  tables)  have  not  been  mentioned.  This  is  not  because 
we  find  them  uninteresting,  but  rather  because  the  prospect  of  numerically  fitting  such  models 
is  daunting. 

We  believe  we  have  indicated  how  more  general  models  could  be  formulated,  and  have 
presented  some  of  the  techniques  that  are  appropriate  for  fitting  the  models  to  actual  data. 
Further  advances  in  methodology  in  this  area  are  likely  to  be  as  dependent  upon  advances  in 
numerical  algorithms  or  computer  hardware,  as  they  will  be  on  new  statistical  ideas. 